Three methods of intonation modeling

Authors

Ann K. Syrdal

Gregor Möhler

Kurt E. Dusterhoff

Alistair Conkie

Alan W. Black

Abstract

This paper compares different methods of generating intonation for an American English Text-to-Speech synthesis system. We look at a primarily rule-based approach and two data-driven approaches. For data-driven modeling we used two separate data sets, each representing a somewhat different prosodic style. One database consisted of recordings of a portion of 1989 Wall Street Journal text from the Penn Treebank Project. The second consisted of recordings of interactive prompts used in telephone network services. Both were read by the same female speaker. Approximately two and one-half hours of speech were phonetically and prosodically segmented and labeled (first automatically, and subsequently verified manually). The prosodic labeling used ToBI [7] tones and breaks. Three different intonation models were compared: (1) a predominantly rule-based model based on ToBI labels [3]; (2) a parametric model using the Tilt approach [8]; and (3) a Vector Quantized model based on an underlying parametric representation [5]. Sentences representative of both prosodic styles were synthesized with each of these models and presented to listeners for subjective ratings in a formal listening test. The results of the evaluation are reported.

Related resources

Parametric modeling of intonation using vector

In this study we propose a data-based approach to intonation modeling using vector quantization. The model is based on an F0 parametrization with an especially designed approximation function. The parameter vectors found are vector quantized with varying codebook sizes. This method is motivated by intonation theories that suggest that pitch accent and boundary phenomena can be described by a di...

Parametric modeling of intonation using vector quantization

Implications of Prosody Modeling for Prosody Recognition

This paper introduces Stem-ML, which is a model of the prosody generation process with an associated description language, and suggests how it may help prosody recognition. We applied Stem-ML modeling to three topics: the modeling of prosodic strengths, intonation types, and noun phrase patterns. Stem-ML parameters derived from F0 contours may have a more consistent relationship with prosodic ...

Modeling Broad Context for Tone Recognition with Conditional Random Fields

We propose a tone recognition approach that employs linear-chain Conditional Random Fields (CRF) to model tone variation due to intonation effects. We implement three linear-chain CRFs which aim at modeling intonation effects at phrase-, sentence-, and story-level boundaries, where we show that standard recognition techniques degrade and common normalization approaches do not improve. We show that all ...

Mechanisms of Question Intonation in Mandarin

This study investigates mechanisms of question intonation in Mandarin Chinese. Three mechanisms of question intonation have been proposed: an overall higher phrase curve, higher strengths of sentence final tones, and a tone-dependent mechanism that flattens the falling slope of the final falling tone and steepens the rising slope of the final rising tone. The phrase curve and strength mechanism...

Publication date: 1998
